library(tidyverse)
library(here)
library(knitr)
library(ggdist)
library(readxl)

We are going to load the raw data collected by Ph.D. student Valentin Moser and research assistants during June 2021 in eight beaver-dammed streams (Figure 1). They sampled at two locations: the main pond created by the beaver (the pool level of the factor in the location column) and 500 meters upstream of that pond (the control level of the same factor).
In the folder data/raw you will find: data_arthropods_flying.xlsx
Flying arthropods were collected by traps like this one:
d <- readxl::read_xlsx(path = here("data", "raw", "data_arthropods_flying.xlsx"), sheet = 1) |>
  select(-c(sort, remarks)) |> # remove unnecessary columns
  filter(!is.na(size)) # remove observations with missing size
head(d)
# A tibble: 6 × 14
site location date class order suborder family subfamily juvenile
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <lgl> <dbl>
1 Chrie Control June Arachnida Araneae <NA> <NA> NA 0
2 Chrie Control June Arachnida Araneae <NA> <NA> NA 0
3 Chrie Control June Arachnida Araneae <NA> <NA> NA 0
4 Chrie Control June Arachnida Araneae <NA> <NA> NA 0
5 Chrie Control June Insecta Coleoptera Adephaga Dytis… NA 0
6 Chrie Control June Insecta Coleoptera Phytophaga Curcu… NA 0
# ℹ 5 more variables: terrestrial <dbl>, juv.aquatic <dbl>, unusable <dbl>,
# size <dbl>, Laufnummer <dbl>
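Before plotting, it can help to check how many individuals fall in each location and how their sizes compare. The sketch below uses a small made-up tibble (`toy_sizes`, with invented values) in place of `d`, because the Excel file is not bundled here; only the column names `location` and `size` are taken from the data above.

```r
library(dplyr)

# Hypothetical stand-in for `d`: values are invented for illustration only
toy_sizes <- tibble(
  location = c("Control", "Control", "Pool", "Pool", "Pool"),
  size     = c(2, 4, 3, 5, 10)
)

# Number of individuals and basic size statistics per location
size_summary <- toy_sizes |>
  group_by(location) |>
  summarise(
    n           = n(),
    mean_size   = mean(size),
    median_size = median(size),
    .groups     = "drop"
  )

size_summary
```

With the real data, replacing `toy_sizes` by `d` should give the same kind of table, one row per sampled location.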
data_dict <- data.frame(
  Column_name = c(
    "site", "location", "date", "class", "order", "suborder", "family",
    "subfamily", "juvenile", "terrestrial", "juv.aquatic", "unusable",
    "size", "remarks"
  ),
  Explanation = c(
    "Study system in which the individual was sampled",
    "Site at which individual was sampled (i.e. pool or control)",
    "Collection period during which the individual was sampled (i.e. June or July)",
    "Taxonomic class the sampled individual belongs to",
    "Taxonomic order the sampled individual belongs to",
    "Taxonomic suborder the sampled individual belongs to (where available)",
    "Taxonomic family the sampled individual belongs to (where available)",
    "Taxonomic subfamily the sampled individual belongs to (where available)",
    "Binary value indicating if the sampled individual was a juvenile. 1 = juvenile, 0 = adult",
    "Binary value indicating if the sampled individual was winged. 1 = non-winged, 0 = winged",
    "Binary value indicating if the sampled individual belongs to a taxon with purely aquatic juveniles. 1 = aquatic, 0 = not aquatic",
    "Binary value indicating if the sampled individual came from water (amphipoda, gastropoda)",
    "Length of the sampled individual in mm (rounded to the nearest whole number). Measured from head to end of abdomen, excluding appendages (wings, limbs, antennae, cerci, etc.)",
    "Additional comments or information about a given individual"
  )
)
kable(data_dict, align = c("l", "l"), caption = "Metadata for data_arthropods_flying.xlsx")

| Column_name | Explanation |
|---|---|
| site | Study system in which the individual was sampled |
| location | Site at which individual was sampled (i.e. pool or control) |
| date | Collection period during which the individual was sampled (i.e. June or July) |
| class | Taxonomic class the sampled individual belongs to |
| order | Taxonomic order the sampled individual belongs to |
| suborder | Taxonomic suborder the sampled individual belongs to (where available) |
| family | Taxonomic family the sampled individual belongs to (where available) |
| subfamily | Taxonomic subfamily the sampled individual belongs to (where available) |
| juvenile | Binary value indicating if the sampled individual was a juvenile. 1 = juvenile, 0 = adult |
| terrestrial | Binary value indicating if the sampled individual was winged. 1 = non-winged, 0 = winged |
| juv.aquatic | Binary value indicating if the sampled individual belongs to a taxon with purely aquatic juveniles. 1 = aquatic, 0 = not aquatic |
| unusable | Binary value indicating if the sampled individual came from water (amphipoda, gastropoda) |
| size | Length of the sampled individual in mm (rounded to the nearest whole number). Measured from head to end of abdomen, excluding appendages (wings, limbs, antennae, cerci, etc.) |
| remarks | Additional comments or information about a given individual |
Question:
How many taxa are there, and which ones?
# Get unique families (excluding NA)
families <- unique(na.omit(d$family))
# Number of unique families
num_families <- length(families)
# Print results
cat("Number of unique families:", num_families, "\n\n")
Number of unique families: 29
cat("Families:\n")
Families:
print(sort(families))
 [1] "Aphidoidea" "Cantharidae" "Carabidae" "Chrysomelidae"
[5] "Coccinellidae" "Cucujidae" "Curculionidae" "Dytiscidae"
[9] "Elmidae" "Erebidae" "Forficulidae" "Formicidae"
[13] "Gerridae" "Gyrinidae" "Haliplidae" "Hydrophilidae"
[17] "Latridiidae" "Monotomidae" "Mordellidae" "Nitidulidae"
[21] "Notonectidae" "Panorpidae" "Phalacridae" "Psylloidea"
[25] "Scirtidae" "Staphylinidae" "Syrphidae" "Tabanidae"
[29] "Vespidae"
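Several rows in the preview above (for example the Araneae) have no family, so counting families alone leaves those individuals out. A minimal sketch of counting taxa at more than one rank follows, using an invented tibble `toy_taxa` as a stand-in for `d`; only the column names `order` and `family` come from the data.

```r
library(dplyr)

# Hypothetical stand-in for `d`: rows are invented for illustration only
toy_taxa <- tibble(
  order  = c("Araneae", "Coleoptera", "Coleoptera", "Diptera", "Coleoptera"),
  family = c(NA, "Dytiscidae", "Carabidae", "Syrphidae", "Dytiscidae")
)

# Distinct taxa per rank, ignoring missing identifications
n_taxa <- toy_taxa |>
  summarise(
    n_orders   = n_distinct(order, na.rm = TRUE),
    n_families = n_distinct(family, na.rm = TRUE)
  )

# Share of individuals that were not identified to family level
prop_no_family <- mean(is.na(toy_taxa$family))

# Individuals per family, most frequent first
family_counts <- toy_taxa |>
  filter(!is.na(family)) |>
  count(family, sort = TRUE)

n_taxa
family_counts
```

Reporting the share of rows without a family next to the family count makes it clearer how much of the sample the list of families actually covers.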
Description of the families we have found:
# Families Found Near Streams in Switzerland
families_table <- data.frame(
Family = c(
"Aphidoidea", "Cantharidae", "Carabidae", "Chrysomelidae",
"Coccinellidae", "Cucujidae", "Curculionidae", "Dytiscidae",
"Elmidae", "Erebidae", "Forficulidae", "Formicidae",
"Gerridae", "Gyrinidae", "Haliplidae", "Hydrophilidae",
"Latridiidae", "Monotomidae", "Mordellidae", "Nitidulidae",
"Notonectidae", "Panorpidae", "Phalacridae", "Psylloidea",
"Scirtidae", "Staphylinidae", "Syrphidae", "Tabanidae",
"Vespidae"
),
Description = c(
"Aphids; plant sap-feeders, often found on riparian vegetation.",
"Soldier beetles; predatory or nectar-feeding, common in meadows near water.",
"Ground beetles; many species are predators along stream banks.",
"Leaf beetles; herbivores on riparian plants.",
"Lady beetles; mostly aphid predators on vegetation.",
"Flat bark beetles; live under bark, sometimes in moist riparian wood.",
"Weevils; herbivores feeding on riparian plants and shrubs.",
"Predaceous diving beetles; aquatic predators in streams and ponds.",
"Riffle beetles; aquatic, live attached to stones in running water.",
"Tiger moths and relatives; larvae feed on diverse plants near water.",
"Earwigs; omnivores hiding under stones and wood along streams.",
"Ants; common in soils and vegetation along riparian zones.",
"Water striders; aquatic predators skating on water surfaces.",
"Whirligig beetles; fast swimmers on water surfaces in streams.",
"Crawling water beetles; small herbivorous beetles in shallow water.",
"Water scavenger beetles; aquatic or semi-aquatic scavengers.",
"Minute brown scavenger beetles; found in decaying plant matter.",
"Root-eating beetles; often associated with decaying wood.",
"Tumbling flower beetles; found on flowers near riparian habitats.",
"Sap beetles; feed on decaying fruit, fungi, and plant material.",
"Backswimmers; aquatic predators that swim upside down.",
"Scorpionflies; scavengers, often in damp shaded stream habitats.",
"Shining flower beetles; small pollen feeders.",
"Psyllids; plant sap-feeders, often on riparian trees and shrubs.",
"Marsh beetles; aquatic or semi-aquatic beetles in wetlands.",
"Rove beetles; very diverse predators and scavengers in moist habitats.",
"Hoverflies; larvae are aphid predators, adults visit flowers.",
"Horse flies; adults feed on blood or nectar, larvae in wet soils.",
"Wasps; diverse group of predators and parasitoids near water."
)
)
kable(families_table, caption = "Ecological roles of arthropod families sampled close to streams in Switzerland")

| Family | Description |
|---|---|
| Aphidoidea | Aphids; plant sap-feeders, often found on riparian vegetation. |
| Cantharidae | Soldier beetles; predatory or nectar-feeding, common in meadows near water. |
| Carabidae | Ground beetles; many species are predators along stream banks. |
| Chrysomelidae | Leaf beetles; herbivores on riparian plants. |
| Coccinellidae | Lady beetles; mostly aphid predators on vegetation. |
| Cucujidae | Flat bark beetles; live under bark, sometimes in moist riparian wood. |
| Curculionidae | Weevils; herbivores feeding on riparian plants and shrubs. |
| Dytiscidae | Predaceous diving beetles; aquatic predators in streams and ponds. |
| Elmidae | Riffle beetles; aquatic, live attached to stones in running water. |
| Erebidae | Tiger moths and relatives; larvae feed on diverse plants near water. |
| Forficulidae | Earwigs; omnivores hiding under stones and wood along streams. |
| Formicidae | Ants; common in soils and vegetation along riparian zones. |
| Gerridae | Water striders; aquatic predators skating on water surfaces. |
| Gyrinidae | Whirligig beetles; fast swimmers on water surfaces in streams. |
| Haliplidae | Crawling water beetles; small herbivorous beetles in shallow water. |
| Hydrophilidae | Water scavenger beetles; aquatic or semi-aquatic scavengers. |
| Latridiidae | Minute brown scavenger beetles; found in decaying plant matter. |
| Monotomidae | Root-eating beetles; often associated with decaying wood. |
| Mordellidae | Tumbling flower beetles; found on flowers near riparian habitats. |
| Nitidulidae | Sap beetles; feed on decaying fruit, fungi, and plant material. |
| Notonectidae | Backswimmers; aquatic predators that swim upside down. |
| Panorpidae | Scorpionflies; scavengers, often in damp shaded stream habitats. |
| Phalacridae | Shining flower beetles; small pollen feeders. |
| Psylloidea | Psyllids; plant sap-feeders, often on riparian trees and shrubs. |
| Scirtidae | Marsh beetles; aquatic or semi-aquatic beetles in wetlands. |
| Staphylinidae | Rove beetles; very diverse predators and scavengers in moist habitats. |
| Syrphidae | Hoverflies; larvae are aphid predators, adults visit flowers. |
| Tabanidae | Horse flies; adults feed on blood or nectar, larvae in wet soils. |
| Vespidae | Wasps; diverse group of predators and parasitoids near water. |
# Alternative: side-by-side histogram (use 'dodge')
ggplot(d, aes(x = size, fill = location)) +
  geom_histogram(binwidth = 1, color = "black", alpha = 0.7, position = "dodge") +
  labs(
    title = "Histogram of individual sizes by sampled location",
    x = "Size (mm)",
    y = "Frequency",
    fill = "Location"
  ) +
  theme_bw(base_size = 14)

# with stat halfeye
ggplot(d, aes(y = location, x = size, fill = location)) +
  stat_halfeye(
    position = "dodge",
    adjust = 1, # smoothness of density
    width = 0.6, # width of half-eye
    justification = -0.1,
    point_interval = mean_qi, # point = mean, intervals = quantile intervals (ggdist default widths: 66% and 95%)
    alpha = 0.7
  ) +
  labs(
    title = "Size Distributions by Location",
    y = "Location",
    x = "Size (mm)"
  ) +
  theme_bw(base_size = 14)

The macrozoobenthos data we are going to use were collected by Ph.D. student Valentin Moser and ETH Master's student Dominic Tinner within the WSL and Eawag project "Species interactions in beaver engineered habitats link land-water ecosystem processes". They sampled 16 streams with beaver presence (8 streams in 2021 and 8 streams in 2022) across Switzerland (Figure 3). The streams varied in surrounding landscape (open landscape or forest), beaver pond area, and stream ecomorphology, i.e., the structural characteristics of the stream (Supplementary Table Y). The ecomorphology index considers factors such as riparian zone modifications and structural alterations of the stream bed to estimate the degree of anthropogenic impact. The streams included in this study belonged to the first three of the five ecomorphology categories: near-natural (category 1), slightly impacted (2), and heavily impacted (3). The ecomorphology value for each stream was retrieved from the Swiss Geoportal, specifically the map layer 'Ecomorphology Level F – River reaches' (link to access).
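The three ecomorphology categories described above can be turned into a labelled, ordered factor so that plots and tables show the category names rather than bare numbers. This is only a sketch: the vector `ecomorphology_codes` holds invented values, and the assumption that the site table stores the categories as the integers 1 to 3 is based on the description in the text.

```r
# Hypothetical ecomorphology codes for a few streams (invented values)
ecomorphology_codes <- c(1, 3, 2, 1, 2)

# Map the numeric categories 1-3 to the labels given in the text
ecomorphology_f <- factor(
  ecomorphology_codes,
  levels  = 1:3,
  labels  = c("near-natural", "slightly impacted", "heavily impacted"),
  ordered = TRUE
)

# Number of streams per category
table(ecomorphology_f)
```

Using an ordered factor keeps the categories in their impact order on plot axes and allows comparisons such as `ecomorphology_f >= "slightly impacted"`.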
These data were collected by Ph.D. student Valentin Moser and Master's students in 2021 and 2022 in 16 streams in Switzerland. They sampled terrestrial arthropods in a 5 x 1 m plot located one meter from the stream's edge, in the centre of the beaver pool area and of the control area. Within each plot, arthropods were sampled at the two ends of the plot inside cylindrical baskets (50 cm diameter, 67 cm height, woven fabric) using suction sampling on a sunny day between 10:00 and 17:00, during peak arthropod activity. The samples were stored in ethanol, and individuals were counted, measured, and identified to order level under a binocular microscope.
t21 <- readxl::read_xlsx(path = here("data", "raw", "data_arthropods_terrestrial_2021.xlsx"), sheet = 1) |>
  mutate(year = 2021) |>
  rename(laufnummer = Laufnummer) |>
  select(-c(remarks, adult))
# load site info
site_info <- read_csv(here("data", "raw", "site_info.csv")) |>
  select(c(2:5, 8, 9, 19, 23))
# check common columns
intersect(names(t21), names(site_info))
[1] "laufnummer"
# merge site info into the t21 data frame
t21_merged <- t21 |>
  left_join(site_info, by = "laufnummer") |>
  select(latitude_sample, longitude_sample, laufnummer, year, site, sample, location, ecomorphology, area_pool, everything())
# show rows with NA in size
t21_merged |> filter(is.na(size))
# A tibble: 1 × 15
latitude_sample longitude_sample laufnummer year site sample location
<dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 47.6 9.18 72 2021 Logge Outflow_3 Outflow
# ℹ 8 more variables: ecomorphology <dbl>, area_pool <dbl>,
# samples_replicate <chr>, class <chr>, order <chr>, suborder <chr>,
# family <chr>, size <dbl>
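After a join like the one above, two quick checks are often useful: whether every sample found a matching row in the site table, and how to set aside rows without a size measurement. The sketch below uses two invented tibbles (`toy_samples` and `toy_sites`) as stand-ins for `t21` and `site_info`; only the key name `laufnummer` and the columns `size` and `ecomorphology` come from the text. Dropping rows with missing `size` mirrors the filter applied to the flying arthropod data and is an assumption about how the analysis continues.

```r
library(dplyr)

# Hypothetical stand-ins for `t21` and `site_info` (invented values)
toy_samples <- tibble(
  laufnummer = c(1, 2, 3, 4),
  size       = c(3, NA, 5, 2)
)
toy_sites <- tibble(
  laufnummer    = c(1, 2, 3),
  ecomorphology = c(1, 2, 3)
)

# Samples whose key has no match in the site table
unmatched <- anti_join(toy_samples, toy_sites, by = "laufnummer")

# A left join keeps every sample row; with unique keys in the site table
# the row count stays unchanged
toy_merged <- left_join(toy_samples, toy_sites, by = "laufnummer")
stopifnot(nrow(toy_merged) == nrow(toy_samples))

# Keep only rows with a size measurement before analysing sizes
toy_merged_sized <- toy_merged |>
  filter(!is.na(size))

unmatched
toy_merged_sized
```

If `unmatched` has rows for the real data, those samples carry `NA` in all site columns after the join and are worth checking against the raw site table.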